Continuous Benchmarking #936


Draft: wants to merge 14 commits into master

Conversation

@Malmahrouqi3 Malmahrouqi3 (Collaborator) commented Jul 11, 2025

User description

Description

Concerning #462: this PR tracks benchmark results (./mfc.sh bench) for performance-critical improvements. Since there is no dedicated benchmark procedure yet, the results of the four existing MFC benchmark cases are reported. To keep performance measurements standardized and free of hardware bias, all benchmarking runs on a GitHub runner until we decide which resources/clusters/allocations/runners to use. Once the proof of concept is finalized, the rest should be straightforward.

Debugging info:
Not much besides reviewing the .md pages.

To-dos:

  • Properly dump results into .json file(s) per the Google Benchmark framework requirements (see the format sketch after this list).
  • A GitHub Page with live charts reporting performance details.
  • If the framework proves too glitchy/complicated, continuously appending datapoints to .json file(s) and plotting from those should not be hard to implement.
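
For reference, a minimal sketch of the JSON layout the Google Benchmark format uses (and which benchmark-action/github-action-benchmark consumes as the "googlecpp" tool); all values here are illustrative:

    {
      "context": {
        "date": "2025-07-11T20:35:00+00:00",
        "host_name": "github-runner",
        "executable": "mfc"
      },
      "benchmarks": [
        {
          "name": "5eq_rk3_weno3_hllc/simulation_time",
          "run_type": "iteration",
          "iterations": 1,
          "real_time": 1.23e9,
          "cpu_time": 1.23e9,
          "time_unit": "ns"
        }
      ]
    }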

Note to self:
Look into retrospectively recording the previous 10-50 base-repo commits to backfill invaluable datapoints (see the sketch below).
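
A minimal sketch of such a backfill, assuming ./mfc.sh bench leaves a bench.yaml in the repository root (the output paths and commit count are illustrative):

    # Re-run the benchmark suite for the last 20 commits on master
    # and stash each result under its commit SHA.
    mkdir -p results
    for sha in $(git rev-list --max-count=20 master); do
        git checkout --quiet "$sha"
        ./mfc.sh bench
        cp bench.yaml "results/${sha}.yaml"
    done
    git checkout --quiet master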


PR Type

Enhancement


Description

  • Implement continuous benchmarking with GitHub Action workflow

  • Remove legacy cluster-specific benchmark scripts

  • Add Google Benchmark format conversion for MFC results

  • Create automated performance tracking and documentation


Changes diagram

flowchart LR
  A["Legacy Benchmark Scripts"] -- "removed" --> B["Continuous Benchmarking"]
  B --> C["GitHub Action Workflow"]
  C --> D["MFC Benchmark Execution"]
  D --> E["Google Benchmark Format"]
  E --> F["Performance Tracking"]
  F --> G["Documentation Generation"]

Changes walkthrough 📝

Relevant files

Enhancement (21 files)
  • cont-bench.yml: Add continuous benchmarking GitHub Action workflow (+149/-0)
  • bench.yml: Remove legacy benchmark workflow (+0/-110)
  • bench.sh: Remove Frontier cluster benchmark script (+0/-16)
  • build.sh: Remove Frontier cluster build script (+0/-17)
  • submit-bench.sh: Remove Frontier cluster benchmark submission script (+0/-54)
  • submit.sh: Remove Frontier cluster submission script (+0/-55)
  • test.sh: Remove Frontier cluster test script (+0/-10)
  • bench.sh: Remove Phoenix cluster benchmark script (+0/-27)
  • submit-bench.sh: Remove Phoenix cluster benchmark submission script (+0/-64)
  • submit.sh: Remove Phoenix cluster submission script (+0/-64)
  • test.sh: Remove Phoenix cluster test script (+0/-21)
  • cleanliness.yml: Remove code cleanliness workflow (+0/-127)
  • coverage.yml: Remove coverage check workflow (+0/-48)
  • docs.yml: Remove documentation workflow (+0/-76)
  • formatting.yml: Remove formatting workflow (+0/-19)
  • line-count.yml: Remove line count workflow (+0/-54)
  • lint-source.yml: Remove source linting workflow (+0/-55)
  • lint-toolchain.yml: Remove toolchain linting workflow (+0/-17)
  • pmd.yml: Remove PMD source analysis workflow (+0/-131)
  • spelling.yml: Remove spell check workflow (+0/-17)
  • test.yml: Remove test suite workflow (+0/-131)

Tests (5 files)
  • test-components.sh: Add component testing script for benchmarks (+48/-0)
  • bench: Add benchmark results YAML file (+153/-0)
  • bench-google.json: Add Google Benchmark format JSON results (+69/-0)
  • bench.json: Add benchmark results JSON file (+195/-0)
  • bench.yaml: Add benchmark results YAML file (+153/-0)

Configuration changes (1 file)
  • .env: Add environment configuration with GitHub tokens (+2/-0)

Documentation (1 file)
  • cont-bench.md: Add continuous benchmarking documentation (+154/-0)

@Copilot Copilot AI review requested due to automatic review settings July 11, 2025 20:35
@Malmahrouqi3 Malmahrouqi3 requested review from a team and sbryngelson as code owners July 11, 2025 20:35

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis 🔶

    #462 - Partially compliant

    Compliant requirements:

    • Use benchmark-action/github-action-benchmark to track benchmark CI results over time
    • Implement continuous performance monitoring

    Non-compliant requirements:

    • Keep track of benchmark results that are currently only tracked in PR CI

    Requires further human verification:

    • Verify that the GitHub Pages deployment works correctly for displaying benchmark charts
    • Confirm that the benchmark data format conversion produces accurate results
    • Test that the alert system functions properly when performance degrades

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    Sensitive information exposure:
    The .env file contains hardcoded GitHub personal access tokens in plain text. These tokens provide repository access and should never be committed to version control. The tokens should be stored as GitHub repository secrets and accessed via ${{ secrets.GITHUB_TOKEN }} in the workflow instead.
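
    A minimal sketch of that pattern, using the benchmark-action/github-action-benchmark step the linked issue asks for (the step name and file path are illustrative):

        # Sketch: the token comes from repository secrets, never from a committed file.
        - name: Publish benchmark results
          uses: benchmark-action/github-action-benchmark@v1
          with:
            tool: googlecpp
            output-file-path: bench-google.json
            github-token: ${{ secrets.GITHUB_TOKEN }}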

    ⚡ Recommended focus areas for review

    Security Risk

    GitHub personal access token is hardcoded in plain text in the .env file, which poses a serious security vulnerability if committed to the repository.

    TOKEN=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
    GITHUB_TOKEN=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
    Possible Issue

    The workflow references 'bench.json' file in the Python conversion script but the actual benchmark output is 'bench.yaml'. This mismatch will cause the conversion step to fail.

    with open('bench.json', 'r') as f:
        mfc_data = json.load(f)
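
    Since the workflow's setup step already installs mikefarah's yq, one way to bridge this mismatch is to convert the YAML output to JSON before the Python step runs (a sketch, assuming bench.yaml is the file MFC actually writes):

        # Convert MFC's YAML benchmark output into the JSON the script expects.
        yq -o=json '.' bench.yaml > bench.json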
    Logic Error

    The workflow runs on all push/PR events but only processes simulation_time metrics, ignoring grind_time data that exists in the MFC output, leading to incomplete performance tracking.

    if 'simulation' in output_summary and 'exec' in output_summary['simulation']:
        benchmarks.append({
            "name": f"{case_name}/simulation_time",
            "family_index": len(benchmarks),
            "per_family_instance_index": 0,
            "run_name": f"{case_name}/simulation_time",
            "run_type": "iteration",
            "repetitions": 1,
            "repetition_index": 0,
            "threads": 1,
            "iterations": 1,
            "real_time": output_summary['simulation']['exec'] * 1e9,
            "cpu_time": output_summary['simulation']['exec'] * 1e9,
            "time_unit": "ns"
        })
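
    If grind-time tracking is wanted, a parallel entry could be appended in the same pass; a sketch, where the 'grind' key and its unit are assumptions about the MFC output rather than a confirmed schema:

        # Sketch: also record grind time, assuming the summary exposes it under
        # a 'grind' key (hypothetical name) analogous to 'exec'.
        if 'simulation' in output_summary and 'grind' in output_summary['simulation']:
            benchmarks.append({
                "name": f"{case_name}/grind_time",
                "run_name": f"{case_name}/grind_time",
                "run_type": "iteration",
                "iterations": 1,
                "real_time": output_summary['simulation']['grind'],
                "cpu_time": output_summary['simulation']['grind'],
                "time_unit": "ns"
            })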


    qodo-merge-pro bot commented Jul 11, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Security
    Remove hardcoded GitHub tokens

    The .env file contains hardcoded GitHub personal access tokens which pose a
    critical security risk. These tokens should never be committed to version
    control as they provide access to GitHub repositories and can be exploited by
    malicious actors.

    .env [1-2]

    -TOKEN=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
    -GITHUB_TOKEN=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
    +# Example environment variables - replace with your actual values
    +# TOKEN=your_github_token_here
    +# GITHUB_TOKEN=your_github_token_here
    Suggestion importance[1-10]: 10


    Why: This is a critical security vulnerability, as committing hardcoded personal access tokens to version control can lead to unauthorized access to GitHub resources.

    Impact: High
    Possible issue
    Install GitHub CLI dependency

    The gh auth token command will fail because the GitHub CLI is not installed in
    the setup step. This will cause the workflow to fail when trying to export the
    TOKEN variable.

    .github/workflows/cont-bench.yml [41-49]

     - name: Setup
       run: | 
         sudo apt update -y
         sudo apt install -y cmake gcc g++ python3 python3-dev hdf5-tools \
    -              libfftw3-dev libhdf5-dev openmpi-bin libopenmpi-dev
    +              libfftw3-dev libhdf5-dev openmpi-bin libopenmpi-dev gh
         export TOKEN=$(gh auth token)
         sudo wget -qO /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
         sudo chmod +x /usr/local/bin/yq
         yq --version
    Suggestion importance[1-10]: 9


    Why: The workflow will fail because it attempts to use the gh command without installing the GitHub CLI first, which is a blocking bug for this CI job.

    Impact: High
    Fix benchmark file path

    The Python script attempts to read bench.json from the current directory, but
    the file is actually located in the pr/ subdirectory based on the previous step.
    This will cause a FileNotFoundError.

    .github/workflows/cont-bench.yml [57-65]

     - name: Convert MFC to Google Benchmark Format
       run: |
         python3 << 'EOF'
         import json
         from datetime import datetime
     
         # Read the MFC benchmark data
    -    with open('bench.json', 'r') as f:
    +    with open('pr/bench.json', 'r') as f:
             mfc_data = json.load(f)
    Suggestion importance[1-10]: 9


    Why: The script will fail with a FileNotFoundError because it tries to open bench.json in the wrong directory, which is a blocking bug for this CI job.

    Impact: High

    @Copilot Copilot AI (Contributor) left a comment

    Pull Request Overview

    This PR introduces continuous benchmarking for performance-critical workflows, including a new test script, documentation updates, sample benchmark data, and a dedicated GitHub Actions workflow. It also removes existing CI workflows in favor of a single continuous benchmarking pipeline.

    • Added test-components.sh to validate component execution and JSON conversions.
    • Generated documentation docs/documentation/cont-bench.md with sample benchmark results.
    • Introduced bench.yaml/json, bench-google.json, and .github/workflows/cont-bench.yml to automate benchmark collection and Google Benchmark conversion.
    • Note: an .env file with sensitive tokens was added and multiple legacy workflows were removed.

    Reviewed Changes

    Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.

    Summary per file:
    • test-components.sh: Script for testing component execution and JSON/YAML conversions
    • docs/documentation/cont-bench.md: Documentation page for continuous benchmarking
    • bench.yaml / bench.json / bench-google.json: Sample benchmark data
    • .github/workflows/cont-bench.yml: New continuous benchmarking workflow
    • .env: Environment file with sensitive tokens
    • .github/workflows/*.yml (other CI workflows): Legacy workflows deleted
    Comments suppressed due to low confidence (4)

    .github/workflows/cont-bench.yml:64

    • The Python script reads 'bench.json' in the root, but the workflow converts and writes to 'pr/bench.json'. Update the path to match where the file is written.
              with open('bench.json', 'r') as f:
    

    bench.yaml:5

    • Avoid hard-coded absolute paths; use relative paths or environment variables to improve portability.
          path: /home/mohammed/Desktop/cont-bench/benchmarks/5eq_rk3_weno3_hllc/case.py
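
    For example, a portable form anchored at the repository root (sketch):

        # Relative to the repo checkout instead of a user's home directory.
        path: benchmarks/5eq_rk3_weno3_hllc/case.py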
    

    .env:1

    • The .env file commits personal access tokens, exposing secrets. Remove it from version control and use GitHub Secrets instead.
    TOKEN=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
    

    @Malmahrouqi3 (Collaborator, Author) commented:

    Posted my GitHub account tokens; I will just disable them right now.

    @sbryngelson (Member) commented:

    You can put back your workflow files. You will want to mimic the setup of the new bench.sh (roughly). I had to remove a couple of GitHub Actions, such as storing artifacts on Phoenix, because the new SOCKS5 proxy doesn't allow for them (so far as I've figured out). I'm worried that all of your new commands here will cause some issues, but we shall find out.

    @sbryngelson sbryngelson marked this pull request as draft July 14, 2025 07:05